Research on non-time-series data filling methods based on feature evaluation


Abstract

With the rapid development of the information age, a large amount of data is used in popular research areas such as data mining. Missing data has a very serious impact on both the process and the result of data mining, so it is important to find out how to fill missing values accurately and efficiently. In this paper, we propose a method that selects the optimal filling method for non-time-series data based on backpropagation of evaluation functions. Based on the target-value error and each method's own error after filling, four classical filling methods, namely mean, interpolation, model prediction, and K-nearest neighbor, are considered for selection. Finally, single-model padding and multi-model weighted padding schemes are compared, and the results show that the filling methods selected with the highest fitness work best for different degrees of missingness and different datasets.
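The selection idea in the abstract can be illustrated with a minimal sketch. It is not the paper's procedure: it covers only two of the four candidate methods (mean and K-nearest neighbor), it scores each candidate by hiding some observed entries and measuring how well they are recovered, and the fitness formula `1 / (1 + RMSE)` is an assumption made for illustration. The function names `fill_mean`, `fill_knn`, and `select_filler` are hypothetical.

```python
import numpy as np

def fill_mean(X):
    """Replace each NaN with the mean of the observed values in its column."""
    filled = X.copy()
    col_mean = np.nanmean(X, axis=0)
    rows, cols = np.where(np.isnan(X))
    filled[rows, cols] = col_mean[cols]
    return filled

def fill_knn(X, k=3):
    """Replace each NaN with the mean of that column over the k nearest rows.

    Distance between two rows is the RMS difference over the columns that
    both rows have observed. Falls back to the column mean when no neighbor
    has a value for the column.
    """
    filled = X.copy()
    fallback = fill_mean(X)
    for i in np.where(np.isnan(X).any(axis=1))[0]:
        dists = []
        for j in range(X.shape[0]):
            if j == i:
                continue
            shared = ~np.isnan(X[i]) & ~np.isnan(X[j])
            if shared.any():
                d = np.sqrt(np.mean((X[i, shared] - X[j, shared]) ** 2))
                dists.append((d, j))
        dists.sort()
        neighbours = [j for _, j in dists[:k]]
        for c in np.where(np.isnan(X[i]))[0]:
            vals = [X[j, c] for j in neighbours if not np.isnan(X[j, c])]
            filled[i, c] = np.mean(vals) if vals else fallback[i, c]
    return filled

def select_filler(X, fillers, hide_frac=0.2, seed=0):
    """Score each filler on artificially hidden entries; return the best one.

    Fitness is 1 / (1 + RMSE) on the hidden entries (an assumed formula).
    """
    rng = np.random.default_rng(seed)
    obs_rows, obs_cols = np.where(~np.isnan(X))
    n_hide = max(1, int(hide_frac * obs_rows.size))
    pick = rng.choice(obs_rows.size, size=n_hide, replace=False)
    r, c = obs_rows[pick], obs_cols[pick]

    X_masked = X.copy()
    X_masked[r, c] = np.nan  # hide known values so the error is measurable

    scores = {}
    for name, fill in fillers.items():
        estimate = fill(X_masked)
        rmse = np.sqrt(np.mean((estimate[r, c] - X[r, c]) ** 2))
        scores[name] = 1.0 / (1.0 + rmse)
    best = max(scores, key=scores.get)
    return best, scores
```

Calling `select_filler(X, {"mean": fill_mean, "knn": fill_knn})` on a float array `X` with `NaN` marking missing entries returns the name of the highest-fitness method together with all scores. A multi-model weighted scheme, as compared in the abstract, could reuse the same scores as weights, though the weighting used in the paper is not given here.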


Similar resources

Research On Similarity of Stochastic Non-stationary Time Series Data Based on Wavelet-Fractal

Traditional dimension reduction methods for similarity queries smooth the data series to some degree, but cause important nonlinear and fractal features of the time series to disappear. The matching method based on wavelet transformation measures similarity by using a distance standard at some resolution level. But in the case of an unknown fractal dim...


Time series prediction based on data compression methods

We propose efficient (“fast” and low memory consuming) algorithms for universal-coding-based prediction methods for real-valued time series. Previously, for such methods it was only proved that the prediction error is asymptotically minimal, and implementation complexity issues have not been considered at all. The provided experimental results demonstrate high precision of the proposed methods. ...


Research on Cassandra Data Compaction Strategies for Time-Series Data

Storage and analysis of time-series data is a subject of intense interest in the current international database research field. Time-series data, a sequence of data points collected at fixed time intervals, is an important basis for future business analysis and prediction. As an excellent NoSQL database, Cassandra is often used to store time-series data because of it...


Querying Time Series Data Based on Similarity

We study similarity queries for time series data where similarity is defined, in a fairly general way, in terms of a distance function and a set of affine transformations on the Fourier series representation of a sequence. We identify a safe set of transformations supporting a wide variety of comparisons and show that this set is rich enough to formulate operations such as moving average and t...



Journal

Journal title: Journal of Physics

Year: 2023

ISSN: 0022-3700, 1747-3721, 0368-3508, 1747-3713

DOI: https://doi.org/10.1088/1742-6596/2425/1/012060